TOWARDS THE UNDERSTANDING
OF NATURAL LANGUAGES BY MACHINES

J. A. Moyne



ABSTRACT



At present a computer system cannot be constructed
for handling the totality of a natural language in any
significant way. It is, however, possible to construct
a system for communication in a narrow field of discourse.

A working model for a specialized discourse based on a
recognition grammar is discussed. Some properties of the
recognition grammar, which is based on transformational
theory, are outlined.

This paper was read at the International Congress of
Linguists at Bucharest, August 28 - September 2, 1967.

It is being published in the proceedings of the Congress.

This paper is, therefore, not available for public distribution.
TOWARDS THE UNDERSTANDING OF
NATURAL LANGUAGES BY MACHINES*

J. A. Moyne

International Business Machines Corporation

1. Recognition Grammars

A generative transformational grammar has three parts: (1) A base
component that generates deep structures. The deep structure is an abstract
and complex object that carries the meaning of the sentence.
(2) Transformational rules that apply to the deep structure and produce a
surface structure similar to a traditional parsing. (3) Phonological rules
that apply to the surface structure and produce a phonological
representation.

For a computer to "understand" a natural-language sentence, the generative
grammar must be reversed: a surface phrase-structure grammar parses the input
sentence producing a surface structure; reverse transformations apply to the
surface structure and produce a deep structure (cf. 4.); and, finally, semantic
rules "interpret" the deep structure as actions to be performed by the machine.
Such a reverse grammar is called a recognition grammar.



*For discussions of the theoretical foundations upon which this
work is based see Noam Chomsky, Aspects of the Theory of Syntax, The MIT Press,
Cambridge, 1965; J. A. Fodor and J. J. Katz (eds.), The Structure of Language:
Readings in the Philosophy of Language, Englewood Cliffs, N. J., Prentice-Hall,
1964; Jerrold J. Katz and Paul M. Postal, An Integrated Theory of Linguistic
Descriptions, The MIT Press, Cambridge, 1964; and the references in
the above works. I am grateful, among others, to G. Carden, R. Carter and
N. Rochester for discussions and many editorial improvements on this paper.
D. B. Loveman designed the computer programming language which is used in
developing the system described in this paper.

There has been a great deal of work on generative transformational grammars,
but very little on recognition grammars. In this paper I shall discuss some
hypotheses about recognition grammars and describe a working computer system
that "understands" English within a limited universe of discourse.

2. Proto-RELADES

Since a recognition grammar faces the same unsolved problems as a
generative grammar, I do not think that we can, as yet, build a computer
system to process and understand the whole of a natural language. We can,
however, build a practical system to handle inquiries about a given subject
or data base.

Proto-RELADES is such a system. It uses an IBM System/360 computer to
communicate with a library in English, but its primary purpose is to experiment
with communication with computers in natural languages. Since we hope to
expand the system to handle other data bases and more sentence patterns, we
have attempted to make both the control system and the grammar as general as
possible, avoiding ad hoc solutions even at the cost of inefficiency within
the library data base. We believe the present system can be expanded within
the limitations of current linguistic theories.

Proto-RELADES has a small dictionary, a transformational recognition
grammar, and a semantic interpreter controlling computer operations. If the
input sentence is "Give me the list of any books you have about grammar," the
system will analyze and "understand" the sentence, and supply the list requested.

The programs that operate the dictionary and grammar are independent of
any particular grammar, data base, or language. Thus a recognition grammar of
any language could be plugged into the system with little or no change in
programming. Alternatively, the existing English grammar could analyze
sentences about other subjects: "What can you tell me about the current
Middle-East dispute?" or "Solve the following equations:..." To get the
desired response to such sentences, we must supply the dictionary with any
missing words and give the computer programs and data to carry out the
necessary commands. Proto-RELADES can also be reversed to test
transformational or phonological rules, accepting any deep structure and set
of rules as input and producing a surface structure.



3. The Proto-RELADES Grammar

The recognition grammar in Proto-RELADES is a reversed transformational
grammar with four components: lexicon, surface grammar, deep grammar, and
semantics.



The lexicon contains the vocabulary of the discourse about the data base,
in this case library operations. It also determines syntactic category from
context and replaces idioms with single-word synonyms: "having to do with" =
"concerning."
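
The idiom-replacement step can be illustrated with a minimal Python sketch.
Everything in it other than the "having to do with" = "concerning" pair is an
assumption made for illustration: the table names, the category labels, and
the plain dictionary lookup (the actual lexicon determines category from
context, which this sketch does not attempt).

```python
# Hypothetical idiom table; only this one pair comes from the text.
IDIOMS = {("having", "to", "do", "with"): "concerning"}

# Hypothetical dictionary entries with illustrative category labels.
CATEGORIES = {"books": "N", "grammar": "N", "concerning": "P", "about": "P"}


def replace_idioms(words):
    """Replace each known multi-word idiom with its single-word synonym."""
    out, i = [], 0
    while i < len(words):
        for idiom, synonym in IDIOMS.items():
            if tuple(words[i:i + len(idiom)]) == idiom:
                out.append(synonym)
                i += len(idiom)
                break
        else:
            # No idiom starts at position i: copy the word unchanged.
            out.append(words[i])
            i += 1
    return out


def lexicon(sentence):
    """Return (category, word) pairs; unknown words get the label '?'."""
    words = replace_idioms(sentence.lower().split())
    return [(CATEGORIES.get(w, "?"), w) for w in words]
```

Calling lexicon("Books having to do with grammar") yields three pairs, with
the four-word idiom collapsed into the single preposition "concerning".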



The output of the lexicon is the bottom two lines of the surface structure
tree; example:

Input sentence: Have any books about grammars been written?

TNS  HAVE ART ADJ ART NU N    P     ART NU N       TNS BE TNS  VT    Q
PAST have 0   any 0   PL book about 0   PL grammar EN  be PAST write ?

Figure 1

The surface grammar is an inverse phrase structure grammar with rules of
the form R: A ← Y (R is a rule label, A a single element, Y a string):

Rule 10: NP ← DET NU N

Context-sensitivity is achieved indirectly by letting rules call other rules:
the CS rule A ← Y / X __ Z would be written in two steps as:

Rule Ni: Rule Sj ← X Y Z

Rule Sj: A ← Y

If Y is in the context X __ Z, rule Ni applies, rule Sj is called and
rewrites Y as A. Rule S(j+1) can then return control to rule N(i+1). If the
context is not satisfied, rule Sj will never apply.
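
The two-step scheme can be sketched in Python as follows. This is only an
illustration under simplifying assumptions: the function names rule_N and
rule_S are hypothetical, the symbols are the literal strings "X", "Y", "Z"
and "A", and the input is a flat list rather than a tree.

```python
def rule_S(symbol):
    """Rule Sj: A <- Y. It is only ever reached through rule_N."""
    return "A" if symbol == "Y" else symbol


def rule_N(string):
    """Rule Ni: find X Y Z and call rule_S on the middle element."""
    out = list(string)
    for k in range(len(out) - 2):
        if out[k] == "X" and out[k + 1] == "Y" and out[k + 2] == "Z":
            # Context satisfied: hand the middle element to rule S.
            out[k + 1] = rule_S(out[k + 1])
    return out
```

Because rule_S is never invoked directly, a Y outside the context X __ Z is
left untouched, which is how the context restriction is enforced without a
context-sensitive rule format.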

The surface grammar is divided into partitions in which rules apply cyclically
to their own output. When no more rules can apply in one partition, control
passes to the next. At the end of the last partition, control returns to the
first rule of the first partition. This double-cycle ordering saves computer
time and prevents unnecessary blocked analyses.
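
A rough Python sketch of the double cycle is given below. It assumes that a
rule is a pair (A, Y) standing for A ← Y, that the input is a flat list of
category symbols, and that the process stops when a full pass over all
partitions changes nothing; the stopping condition and the function names are
assumptions, and only Rule 10 (NP ← DET NU N) is taken from the text.

```python
def apply_once(rules, symbols):
    """Apply the first rule whose right-hand side occurs in symbols.

    Returns the new symbol list and a flag saying whether anything changed.
    """
    for lhs, rhs in rules:
        n = len(rhs)
        for k in range(len(symbols) - n + 1):
            if symbols[k:k + n] == rhs:
                return symbols[:k] + [lhs] + symbols[k + n:], True
    return symbols, False


def run_partitions(partitions, symbols):
    """Inner cycle: exhaust one partition before moving to the next.
    Outer cycle: return to the first partition until a pass changes nothing."""
    while True:
        changed_in_pass = False
        for rules in partitions:
            changed = True
            while changed:
                symbols, changed = apply_once(rules, symbols)
                changed_in_pass = changed_in_pass or changed
        if not changed_in_pass:
            return symbols


# Partition 1 holds Rule 10 plus a hypothetical NP <- NP PP rule;
# partition 2 holds a hypothetical PP <- P NP rule.
PARTITIONS = [
    [("NP", ["DET", "NU", "N"]), ("NP", ["NP", "PP"])],
    [("PP", ["P", "NP"])],
]
```

With these illustrative rules, the string DET NU N P DET NU N needs the
return to the first partition: the PP built in partition 2 can only be
attached by the NP ← NP PP rule on the second outer pass.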



The output of the surface grammar is the surface structure tree; example:

[surface structure tree diagram]

Figure 2

(This and other figures in this paper have been slightly modified from actual
computer outputs for the sake of simplicity.)

The deep grammar is a set of ordered transformational rules. Each rule
has two parts: a structural description defining the surface or intermediate
structure subject to the rule, and computer instructions that make the desired
changes. If a rule does not apply, the next rule in the sequence is tried; if
a rule applies, control may be transferred to the next rule, back to the rule
just applied (for an iterative rule), back to an earlier rule (to re-apply a
sequence of rules), or on to a later rule (to skip over rules that will not
apply). This freedom in ordering saves computer time. When all the applicable
transformations have applied, the resulting tree is the deep structure of the
input sentence.

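
The rule-ordering scheme just described can be sketched in Python. The
representation is an assumption made for brevity: each rule is a triple of a
test (the structural description), a change function (the instructions), and
the index of the rule that receives control after a successful application.
The structure is a flat list rather than a tree, the two sample rules are
invented, and max_steps is a hypothetical safeguard against endless loops.

```python
def run_transformations(rules, structure, max_steps=1000):
    """Apply ordered rules with explicit transfer of control."""
    i, steps = 0, 0
    while i < len(rules) and steps < max_steps:
        applies, change, goto = rules[i]
        if applies(structure):
            structure = change(structure)
            i = goto      # same rule, an earlier rule, or a later rule
        else:
            i += 1        # rule does not apply: try the next in sequence
        steps += 1
    return structure


def drop_first(items, x):
    """Return a copy of items without the first occurrence of x."""
    k = items.index(x)
    return items[:k] + items[k + 1:]


# Rule 0 is iterative (control returns to itself); rule 1 passes control on.
RULES = [
    (lambda s: "0" in s, lambda s: drop_first(s, "0"), 0),
    (lambda s: s[:1] == ["Q"], lambda s: s[1:] + ["?"], 2),
]
```

Here run_transformations(RULES, ["Q", "0", "any", "0", "book"]) applies
rule 0 twice, falls through to rule 1 once rule 0 no longer applies, and
stops when control passes beyond the last rule.
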
In Proto-RELADES all deep structures are based on this form:

[tree diagram rooted in S]

Figure 4

Another sentence (S) with the same structure may be embedded under each
NP, and so on indefinitely. Unlimited recursion cannot lead to problems in a
recognition grammar because the depth of embedding is controlled by the input
sentence.

It is interesting that this simple deep structure is adequate for semantic
interpretation without appeal to hypothetical verbs or implausible embeddings.
Such a level, intermediate between surface structure and the highly abstract
output of a generative semantics or other base components, seems to be
necessary for computerized recognition grammars, and might well be useful in a
model of perception or learning.

4. Ambiguity Problems 

Computers are notorious for finding ambiguities in the simplest sentences;
Proto-RELADES solves this problem by its restricted data base and discourse.
Lexical ambiguities are resolved by giving words only the meaning pertinent to
the data base, and some structural ambiguities can be handled similarly: "List
books on computers in the library", for our library system, can only mean books
in the library about computers, not books on top of computers located in the
library. Data-base restrictions therefore permit the first analysis and reject
the second.

Some sentences remain ambiguous even within the restricted data base; we
propose to resolve these ambiguities by conversation between man and machine.
If the input is: "List the documents about books in the library." the computer
produces the possible relevant analyses and asks the user, "Do you mean
documents about the library's books or documents in the library about books."
The user answers and the machine executes the approved analysis. This device
is not yet working.
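
Since the device is only proposed, the following Python sketch is purely
hypothetical. It assumes each candidate analysis comes paired with an English
paraphrase, that the data-base restrictions can be expressed as a predicate
(permitted), and that the user is reached through a callable (ask) returning
the 1-based number of the chosen reading.

```python
def choose_analysis(analyses, permitted, ask):
    """Filter analyses by the data base, then question the user if needed.

    analyses:  list of (paraphrase, analysis) pairs
    permitted: predicate standing in for the data-base restrictions
    ask:       callable taking the question, returning a 1-based choice
    """
    relevant = [pair for pair in analyses if permitted(pair[1])]
    if not relevant:
        return None                      # no analysis fits the data base
    if len(relevant) == 1:
        return relevant[0][1]            # unambiguous: no question needed
    question = "Do you mean " + " or ".join(p for p, _ in relevant) + "?"
    return relevant[ask(question) - 1][1]
```

The user is consulted only when more than one analysis survives the data-base
restrictions, so sentences such as "List books on computers in the library"
would be resolved without any question.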

This system, with a restricted data base and man-machine conversation, is
less ambitious than programs that permit all possible analyses and try to select
the relevant one. Our model may be closer to human analysis, which is also
controlled by context and resolves ambiguities by questioning.

5. Semantics 

The semantic component of Proto-RELADES consists of ordered transformational
rules that apply to a deep structure and produce an executable statement.
The system then calls an operational program to execute that statement. The
semantic transformations depend on the data base, but the control program
that applies them is completely independent.

We originally intended to write a separate operational program to represent
the meaning of each of the dozens of verbs in the Proto-RELADES lexicon, and
have a control program to select which operational programs applied and to
arrange them in the proper order. We think this is more practical than the
usual method of translating the analysis of the English input into an
artificial language.

To our surprise we found that we needed only two programs as semantic
primitives for the whole library lexicon. The stative program prints "N
documents were found," the non-stative program prints a list. Verbs that ask
whether the library has a certain document are [+stative]. All other verbs
in this system turn out to be requests for information from the library catalog
and are [-stative]. Normally [-stative] verbs like "write" ("Write the
list of...") will call the stative program in appropriate context ("Are there
any books written about computers?").
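
The division of labor between the two primitives can be sketched in Python.
The feature table, the function names, and the existential_context flag are
hypothetical stand-ins: in the system itself the choice is made by the
semantic transformations, not by a dictionary lookup.

```python
def stative_program(documents):
    """Report how many documents were found."""
    return f"{len(documents)} documents were found"


def non_stative_program(documents):
    """Produce the requested list, one document per line."""
    return "\n".join(documents)


# Hypothetical [+stative] / [-stative] features for a few verbs.
STATIVE = {"have": True, "list": False, "write": False, "give": False}


def execute(verb, documents, existential_context=False):
    """Select the operational program from the verb's stative feature.

    A [-stative] verb in an existential question ("Are there any books
    written about computers?") is routed to the stative program instead.
    """
    if STATIVE[verb] or existential_context:
        return stative_program(documents)
    return non_stative_program(documents)
```

For a catalog hit list of two titles, execute("list", titles) returns the two
titles, while execute("write", titles, existential_context=True) returns the
count message.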

Obviously a more complex data base would require more operational
programs, but we are convinced that a powerful system with many applications
could be built with a reasonable number of operating programs.

Our experience with Proto-RELADES thus suggests that semantic primitives
may not be atomistic markers but rather complex and powerful entities like our
stative and non-stative programs. Perception would then work by associating
words with these complex primitives through transformational rules in the
semantic component. We feel that this speculation merits further investigation.